Misannotated Multi-Nucleotide Variants in Public Cancer Genomics Datasets Lead to Inaccurate Mutation Calls with Significant Implications

Abstract

Although next-generation sequencing (NGS) is widely used in cancer to profile tumors and detect variants, most somatic variant callers used in these pipelines identify variants at the lowest possible granularity, single-nucleotide variants (SNV). As a result, multiple adjacent SNVs are called individually instead of as a multi-nucleotide variant (MNV). With this approach, the amino acid change from an individual SNV within a codon could be different from the amino acid change based on the MNV that results from combining the SNVs, leading to incorrect conclusions about the downstream effects of the variants. Here, we analyzed 10,383 variant call files (VCF) from the Cancer Genome Atlas (TCGA) and found 12,141 incorrectly annotated MNVs. Analysis of seven commonly mutated genes from 178 studies in cBioPortal revealed that MNVs were consistently missed in 20 of these studies, whereas they were correctly annotated in 15 more recent studies. At the BRAF V600 locus, the most common example of an MNV, several public datasets reported separate BRAF V600E and BRAF V600M variants instead of a single merged V600K variant. VCFs from the TCGA Mutect2 caller were used to develop a solution to merge SNVs into MNVs. Our custom script used the phasing information from the SNV VCF and determined whether SNVs were at the same codon and needed to be merged into an MNV before variant annotation. This study shows that institutions performing NGS for cancer genomics should incorporate the step of merging MNVs as a best practice in their pipelines.

Significance: Identification of incorrect mutation calls in TCGA, including clinically relevant BRAF V600 and KRAS G12, will influence research and potentially clinical decisions.
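The BRAF V600 case can be worked through in a few lines. The sketch below is not the authors' script, which is not reproduced here; it is a minimal illustration under stated assumptions. It works in coding-strand coordinates, uses the known reference codon GTG (valine) at BRAF codon 600, and represents each SNV as a hypothetical record with `phase_set`, `codon`, `offset`, and `alt` keys. `phase_set` stands in for whatever phasing identifier the caller writes to the VCF. SNVs that share both a phase set and a codon are merged before the amino acid change is computed.

```python
from collections import defaultdict

# Minimal codon table: only the four codons this example needs.
CODON_TO_AA = {"GTG": "V", "GAG": "E", "ATG": "M", "AAG": "K"}


def apply_snvs(ref_codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(ref_codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def protein_change(ref_codon, snvs, codon_number):
    """Format the amino acid change, e.g. 'V600K', for a set of SNVs in one codon."""
    alt_codon = apply_snvs(ref_codon, snvs)
    return f"{CODON_TO_AA[ref_codon]}{codon_number}{CODON_TO_AA[alt_codon]}"


def merge_phased_snvs(records):
    """Group SNVs that share a phase set and a codon; each group is one MNV candidate."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["phase_set"], rec["codon"])].append((rec["offset"], rec["alt"]))
    return {key: sorted(snvs) for key, snvs in groups.items()}


# Hypothetical records for two adjacent SNVs on the same haplotype.
records = [
    {"phase_set": "ps1", "codon": 600, "offset": 0, "alt": "A"},  # c.1798G>A
    {"phase_set": "ps1", "codon": 600, "offset": 1, "alt": "A"},  # c.1799T>A
]

REF_CODON = "GTG"  # BRAF codon 600 on the coding strand

# Annotating each SNV on its own, as an SNV-only pipeline would.
separate = [
    protein_change(REF_CODON, [(r["offset"], r["alt"])], r["codon"]) for r in records
]

# Annotating after merging SNVs that are phased together in the same codon.
merged = [
    protein_change(REF_CODON, snvs, codon)
    for (_, codon), snvs in merge_phased_snvs(records).items()
]

print(separate)  # ['V600M', 'V600E']
print(merged)    # ['V600K']
```

Keying the groups on both the phase set and the codon means that two SNVs in the same codon but on different haplotypes stay separate, which is why phasing information is needed before merging.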

Journal

Journal title: Cancer Research

Year: 2021

ISSN: 1538-7445, 0008-5472

DOI: https://doi.org/10.1158/0008-5472.can-20-2151